The effectiveness of query-specific hierarchic clustering in information retrieval

Authors

  • Anastasios Tombros
  • Robert Villa
  • C. J. van Rijsbergen
Abstract

Hierarchic document clustering has been widely applied to Information Retrieval (IR) on the grounds of its potential improved effectiveness over inverted file search. However, previous research has been inconclusive as to whether clustering does bring improvements. In this paper we take the view that if hierarchic clustering is applied to search results (query-specific clustering), then it has the potential to increase the retrieval effectiveness compared both to that of static clustering and of conventional inverted file search. We conducted a number of experiments using five document collections and four hierarchic clustering methods. Our results show that the effectiveness of query-specific clustering is indeed higher, and suggest that there is scope for its application to IR.
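
The idea of query-specific clustering described above (cluster only the documents retrieved for a query, rather than the whole collection) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: raw term-frequency vectors, cosine similarity, group-average linkage, and the function names `retrieve` and `cluster_results`. The abstract does not say which four clustering methods or which weighting scheme the authors used.

```python
import math
from collections import Counter


def tf_vector(text):
    """Raw term-frequency vector (assumed weighting, for illustration only)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, docs, top_n):
    """Initial search: rank documents against the query, keep the top_n matches."""
    q = tf_vector(query)
    scored = [(cosine(q, tf_vector(d)), i) for i, d in enumerate(docs)]
    scored = [s for s in scored if s[0] > 0]          # drop non-matching documents
    scored.sort(key=lambda s: (-s[0], s[1]))          # best score first
    return [i for _, i in scored[:top_n]]


def cluster_results(ids, docs):
    """Group-average agglomerative clustering over the retrieved documents only.

    Returns the hierarchy as nested tuples of document indices.
    """
    vecs = {i: tf_vector(docs[i]) for i in ids}
    clusters = [([i], i) for i in ids]                # (member ids, subtree)
    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                a, b = clusters[x][0], clusters[y][0]
                # Average pairwise similarity between the two clusters.
                sim = sum(cosine(vecs[i], vecs[j]) for i in a for j in b) / (
                    len(a) * len(b)
                )
                if best is None or sim > best[0]:
                    best = (sim, x, y)
        _, x, y = best
        merged = (clusters[x][0] + clusters[y][0], (clusters[x][1], clusters[y][1]))
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)] + [merged]
    return clusters[0][1] if clusters else None


docs = [
    "hierarchic clustering of documents",
    "clustering search results hierarchic",
    "inverted file search",
    "cooking pasta recipes",
]
hits = retrieve("hierarchic clustering search", docs, top_n=3)
tree = cluster_results(hits, docs)
print(hits, tree)
```

In static clustering the hierarchy would instead be built once over all of `docs`, independently of any query; here the unrelated fourth document never enters the hierarchy because it is not retrieved.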

Similar resources

The effectiveness of query-based hierarchic clustering of documents for information retrieval

Comparison of Different Distance Measures on Hierarchical Document Clustering in 2-Pass Retrieval

Hierarchic document clustering has been applied to search results (query-specific clustering) on the grounds of its potential improved effectiveness compared both to that of static clustering and of conventional inverted file search (IFS). In this paper we review and compare the effects of seven different measures of similarity among documents in hierarchic query-specific clustering. We have c...

Improving Quality of Clustering using Cellular Automata for Information retrieval

Clustering has been widely applied to Information Retrieval (IR) on the grounds of its potential improved effectiveness over inverted file search. Clustering is a mostly unsupervised procedure, and the majority of clustering algorithms depend on certain assumptions in order to define the subgroups present in a data set. A clustering quality measure is a function that, given a data set and it...

A Hierarchic Architecture for Conceptual Information Retrieval

Conceptual retrieval returns information related to a specific topic but not restricted to a query term. A common approach is to compare the query with all the documents in the database. When the number of documents is large, the searching time becomes significant. In this paper we propose a hierarchic architecture which integrates latent semantic indexing (LSI) and hierarchic agglomerative clustering ...

QEA: A New Systematic and Comprehensive Classification of Query Expansion Approaches

A major problem in information retrieval is the difficulty of defining the user's information needs; on the other hand, when a user submits a query, there is a vast amount of information to retrieve. Different methods, therefore, have been suggested for query expansion, which is concerned with reconfiguring the query by increasing efficiency and improving the criterion accuracy in the information...

Journal title:
  • Inf. Process. Manage.

Volume 38

Publication date 2002